Abstract

We consider the problem of computing numerical invariants of programs by
abstract interpretation. Our method eschews two traditional sources of imprecision:
(i) the use of widening operators for enforcing convergence within a finite number of
iterations; (ii) the use of merge operations (often, convex hulls) at the merge points of
the control flow graph. It instead computes the least inductive invariant expressible
in the domain at a restricted set of program points, and analyzes the rest of the
code en bloc. We emphasize that we compute this inductive invariant precisely.
For that we extend the strategy improvement algorithm of Gawlitza and Seidl [17].
If we applied their method directly, we would have to solve an exponentially sized
system of abstract semantic equations, resulting in memory exhaustion. Instead,
we keep the system implicit and discover strategy improvements using SAT modulo
real linear arithmetic (SMT). For evaluating strategies we use linear programming.
Our algorithm has low polynomial space complexity and, on contrived examples,
performs exponentially many strategy improvement steps in the worst case; this
is unsurprising, since we show that the associated abstract reachability problem is
$\Pi^p_2$-complete.

1 Introduction 

Motivation Static program analysis attempts to derive properties about the run-time
behavior of a program without running the program. Among interesting properties are
the numerical ones: for instance, that a given variable x always has a value in the range
[12, 41] when reaching a given program point. An analysis solely based on such interval
relations at all program points is known as interval analysis. More refined numerical
analyses include, for instance, finding for each program point an enclosing polyhedron
for the vector of program variables [13]. In addition to obtaining facts about the values
of numerical program variables, numerical analyses are used as building blocks for e.g.
pointer and shape analyses.

However, by Rice's theorem, only trivial properties can be checked automatically
[26]. In order to check non-trivial properties we are usually forced to use abstractions.
A systematic way for inferring properties automatically w.r.t. a given abstraction is
given through the abstract interpretation framework of Cousot and Cousot [13]. This
framework safely over-approximates the run-time behavior of a program.

When using the abstract interpretation framework, we usually have two sources
of imprecision. The first source of imprecision is the abstraction itself: for instance,
if the property to be proved needs a non-convex invariant to be established, and our
abstraction can only represent convex sets, then we cannot prove the property. Take
for instance the C-code y = 0; if (x <= -1 || x >= 1) { if (x == 0) y = 1; }. No matter
what the values of the variables x and y are before the execution of the above C-code,
after the execution the value of y is 0. The invariant $|x| \ge 1$ in the "then" branch is not
convex, and its convex hull includes x = 0. Any static analysis method that computes
a convex invariant in this branch will thus also include y = 1. In contrast, our method
avoids enforcing convexity, except at the heads of loops.

The second source of imprecision is the use of safe but imprecise methods
for solving the abstract semantic equations that describe the abstract semantics: such
methods safely over-approximate exact solutions, but do not return exact solutions in all
cases. The reason is that we are concerned with abstract domains that contain infinite
ascending chains, in particular if we are interested in numerical properties: the complete
lattice of all n-dimensional closed real intervals, used for interval analysis, is an example.
The traditional methods are based on Kleene fixpoint iteration which (purely applied)
is not guaranteed to terminate in interesting cases. In order to enforce termination
(at the price of imprecision) traditional methods make use of the widening/narrowing
approach of Cousot and Cousot. Roughly speaking, widening extrapolates the first iterations
of a sequence to a possible limit, but can easily overshoot the desired result. In order to
avoid this, various tricks are used, including "widening up to" [27, Sec. 3.2], "delayed"
widening, or widening with "thresholds". However, these tricks, although they may help in
many practical cases, are easily thwarted. Gopan and Reps proposed "lookahead widening",
which discovers new feasible paths and adapts widening accordingly; again this method is
no panacea. Furthermore, analyses involving widening are non-monotonic: stronger
preconditions can lead to weaker invariants being automatically inferred, a rather
non-intuitive behaviour. Since our method does not use widening at all, it avoids these
problems.
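
The overshoot caused by widening can be seen on a toy case. The following is a minimal sketch, assuming a hypothetical loop `x = 0; while (x <= 99) x = x + 1;` (not taken from this article) and the textbook interval widening that pushes every unstable bound to infinity. Plain Kleene iteration happens to terminate here and yields the least interval invariant at the loop head, whereas the widened iteration loses the upper bound.

```python
import math

def join(a, b):
    """Least upper bound of two intervals (lo, hi); None is the empty interval."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Textbook interval widening: a bound that is not stable jumps to infinity."""
    if old is None:
        return new
    if new is None:
        return old
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)

def loop_body(iv):
    """Abstract effect of the guard x <= 99 followed by x := x + 1."""
    if iv is None:
        return None
    lo, hi = iv[0], min(iv[1], 99)
    if lo > hi:
        return None
    return (lo + 1, hi + 1)

def analyze(use_widening):
    """Iterate inv := [0,0] joined with loop_body(inv) until it stabilizes."""
    inv = None
    while True:
        nxt = join((0, 0), loop_body(inv))
        if use_widening:
            nxt = widen(inv, nxt)
        if nxt == inv:
            return inv
        inv = nxt

precise = analyze(use_widening=False)  # least interval invariant at the loop head
widened = analyze(use_widening=True)   # invariant found with widening only
print(precise, widened)
```

Narrowing or thresholds could recover the bound in this particular case; the sketch only illustrates why extrapolation is a source of imprecision.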



Our Contribution We fight both sources of imprecision noted above: 

• In order to improve the precision of the abstraction, we abstract sequences of
if-then-else statements without loops en bloc. In the above example, we are then
able to conclude that y = 0 holds. In other words: we abstract sets of states only
at the heads of loops, or, more generally, at a cut-set of the control-flow graph (a
cut-set is a set of program points such that removing them would cut all loops).

• Our main technical contribution consists of a practical method for precisely
computing abstract semantics of affine programs w.r.t. the template linear constraint
domains of Sankaranarayanan et al. [42], with sequences of if-then-else statements
which do not contain loops abstracted en bloc. Our method is based on a
strict generalization of the strategy improvement algorithm of Gawlitza and Seidl
[17]. The latter algorithm could be directly applied to the problem we
solve in this article, but the size of its input would be exponential in the size
of the program, because we would then need to explicitly enumerate all program paths
between cut-nodes which do not cross other cut-nodes. In this article, we give an
algorithm with low polynomial memory consumption that uses exponential time
in the worst case. The basic idea consists in avoiding an explicit enumeration of
all paths through sequences of if-then-else statements which do not contain loops.
Instead we use a SAT modulo real linear arithmetic solver for improving the current
strategy locally. For evaluating each strategy encountered during the strategy
iteration, we use linear programming.

• As a byproduct of our considerations we show that the corresponding abstract
reachability problem is $\Pi^p_2$-complete. In fact, we show that it is $\Pi^p_2$-hard even if
the loop invariant being computed consists of a single inequality $x \le C$, where x
is a program variable and C is the parameter of the invariant. Hence, exponential
worst-case running-time seems to be unavoidable.

Related Work Recently, several alternative approaches for computing numerical
invariants (for instance w.r.t. template linear constraints) were developed:

Strategy Iteration Strategy iteration (also called policy iteration) was introduced by
Howard for solving stochastic control problems [29, 40] and is also applied to two-player
zero-sum games [28, 45] or min-max-plus systems. Adjé et al., Costan et al.
[1], and Gaubert et al. [16] developed a strategy iteration approach for solving the abstract
semantic equations that occur in static program analysis by abstract interpretation.
Their approach can be seen as an alternative to the traditional widening/narrowing
approach. The goal of their algorithm is to compute least fixpoints of monotone
self-maps $f$, where $f(x) = \min \{\pi(x) \mid \pi \in \Pi\}$ for all $x$ and $\Pi$ is a family of self-maps.
The assumption is that one can efficiently compute the least fixpoint $\mu\pi$ of $\pi$ for
every $\pi \in \Pi$. The $\pi$'s are the (min-)strategies. Starting with an arbitrary min-strategy
$\pi^{(0)}$, the min-strategy is successively improved. The sequence $(\pi^{(k)})_k$ of attained
min-strategies results in a decreasing sequence $\mu\pi^{(0)} > \mu\pi^{(1)} > \cdots > \mu\pi^{(k)}$ that stabilizes
whenever $\mu\pi^{(k)}$ is a fixpoint of $f$, though not necessarily the least one. However, there are
indeed important cases where minimality of the obtained fixpoint can be guaranteed.
Moreover, an important advantage of their algorithm is that it can be stopped at
any time with a safe over-approximation. This is in particular interesting if there are
infinitely many min-strategies [3]. Costan et al. [9] showed how to use their framework
for performing interval analysis without widening. Gaubert et al. [16] extended this
work to the following relational abstract domains: the zone domain [33], the octagon
domain [34] and in particular the template linear constraint domains [42]. Gawlitza
and Seidl [17] presented a practical (max-)strategy improvement algorithm for
computing least solutions of systems of rational equations. Their algorithm enables them
to perform a template linear constraint analysis precisely, even if the mappings are
not non-expansive. That is, their algorithm always computes least solutions of
abstract semantic equations, not just some solutions.



Acceleration Techniques Gonnord [23] and Gonnord and Halbwachs [24] investigated an
improvement of linear relation analysis that consists in computing, when possible, the
exact (abstract) effect of a loop. The technique is fully compatible with the use of
widening, and whenever it applies, it improves both the precision and the performance
of the analysis. Gawlitza et al. [20] and Leroux and Sutre [31] studied cases where interval
analysis can be done in polynomial time w.r.t. a uniform cost measure, where memory
accesses and arithmetic operations are counted for $O(1)$.

Quantifier Elimination Recent improvements in SAT/SMT solving techniques have
made it possible to perform quantifier elimination on larger formulas [36]. Monniaux
[37] developed an analysis method based on quantifier elimination in the theory of
rational linear arithmetic. This method targets the same domains as the present article;
it however produces a richer result. It can not only compute the least invariant inside 
the abstract domain of a loop, but also express it as a function of the precondition of the 
loop; the method outputs the source code of the optimal abstract transformer mapping 
the precondition to the invariant. Its drawback is its high cost, which makes it practical 
only on small code fragments; thus, its intended application is modular analysis: analyze 
very precisely small portions of code (functions, modules, nodes of a reactive data-flow 
program, . . . ), and use the results for analyzing larger portions, perhaps with another 
method, including the method proposed in this article. 

Mathematical Programming Colón et al., Cousot [10], and Sankaranarayanan et al. [41]
presented approaches for generating linear invariants that use non-linear constraint
solving. Leconte et al. [30] propose a mathematical programming formulation whose
constraints define the space of all post-solutions of the abstract semantic equations. 
The objective function aims at minimizing the result. For programs that use affine 
assignments and affine guards, only, this yields a mixed integer linear programming 
formulation for interval analysis. The resulting mathematical programming problems 
can then be solved to guaranteed global optimality by means of general purpose branch- 
and-bound type algorithms. 



2 Basics 

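Notations $\mathbb{B} = \{0, 1\}$ denotes the set of Boolean values. The set of real numbers is
denoted by $\mathbb{R}$. The complete linearly ordered set $\mathbb{R} \cup \{-\infty, \infty\}$ is denoted by
$\overline{\mathbb{R}}$. We call two vectors $x, y \in \overline{\mathbb{R}}^n$ comparable iff $x \le y$ or $y \le x$ holds.
For $f : X \to \overline{\mathbb{R}}^m$ with $X \subseteq \overline{\mathbb{R}}^n$, we set $\mathrm{dom}(f) := \{x \in X \mid f(x) \in \mathbb{R}^m\}$
and $\mathrm{fdom}(f) := \mathrm{dom}(f) \cap \mathbb{R}^n$. We denote the $i$-th row (resp. the $j$-th column) of a
matrix $A$ by $A_{i\cdot}$ (resp. $A_{\cdot j}$). Accordingly, $A_{i \cdot j}$ denotes the component in the $i$-th row
and the $j$-th column. We also use this notation for vectors and mappings $f : X \to Y^k$.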

Assume that a fixed set $\mathbf{X}$ of variables and a domain $\mathbb{D}$ is given. We consider
equations of the form $x = e$, where $x \in \mathbf{X}$ is a variable and $e$ is an expression over $\mathbb{D}$.
A system $\mathcal{E}$ of (fixpoint) equations is a finite set $\{x_1 = e_1, \ldots, x_n = e_n\}$ of equations,
where $x_1, \ldots, x_n$ are pairwise distinct variables. We denote the set $\{x_1, \ldots, x_n\}$ of
variables occurring in $\mathcal{E}$ by $\mathbf{X}_{\mathcal{E}}$. We drop the subscript whenever it is clear from the
context.

For a variable assignment $\rho : \mathbf{X} \to \mathbb{D}$, an expression $e$ is mapped to a value $[\![e]\!]\rho$
by setting $[\![x]\!]\rho := \rho(x)$ and $[\![f(e_1, \ldots, e_k)]\!]\rho := f([\![e_1]\!]\rho, \ldots, [\![e_k]\!]\rho)$, where $x \in \mathbf{X}$, $f$ is
a $k$-ary operator, for instance $+$, and $e_1, \ldots, e_k$ are expressions. Let $\mathcal{E}$ be a system of
equations. We define the unary operator $[\![\mathcal{E}]\!]$ on $\mathbf{X} \to \mathbb{D}$ by setting $([\![\mathcal{E}]\!]\rho)(x) := [\![e]\!]\rho$
for all $x = e \in \mathcal{E}$. A solution is a variable assignment $\rho$ such that $\rho = [\![\mathcal{E}]\!]\rho$ holds. The
set of solutions is denoted by $\mathrm{Sol}(\mathcal{E})$.

Let $\mathbb{D}$ be a complete lattice. We denote the least upper bound and the greatest lower
bound of a set $X \subseteq \mathbb{D}$ by $\bigvee X$ and $\bigwedge X$, respectively. The least element $\bigvee \emptyset$ (resp. the
greatest element $\bigwedge \emptyset$) is denoted by $\bot$ (resp. $\top$). We define the binary operators $\vee$ and
$\wedge$ by $x \vee y := \bigvee \{x, y\}$ and $x \wedge y := \bigwedge \{x, y\}$ for all $x, y \in \mathbb{D}$, respectively. For $\Box \in \{\vee, \wedge\}$,
we will also consider $x_1 \,\Box \cdots \Box\, x_k$ as the application of a $k$-ary operator. This will
cause no problems, since the binary operators $\vee$ and $\wedge$ are associative and commutative.
An expression $e$ (resp. an equation $x = e$) is called monotone iff all operators occurring
in $e$ are monotone.

The set $\mathbf{X} \to \mathbb{D}$ of all variable assignments is a complete lattice. For $\rho, \rho' : \mathbf{X} \to \mathbb{D}$,
we write $\rho \lhd \rho'$ (resp. $\rho \rhd \rho'$) iff $\rho(x) < \rho'(x)$ (resp. $\rho(x) > \rho'(x)$) holds for all $x \in \mathbf{X}$.
For $d \in \mathbb{D}$, $\underline{d}$ denotes the variable assignment $\{x \mapsto d \mid x \in \mathbf{X}\}$. A variable assignment
$\rho$ with $\underline{\bot} \lhd \rho \lhd \underline{\top}$ is called finite. A pre-solution (resp. post-solution) is a variable
assignment $\rho$ such that $\rho \le [\![\mathcal{E}]\!]\rho$ (resp. $\rho \ge [\![\mathcal{E}]\!]\rho$) holds. The set of all pre-solutions
(resp. the set of all post-solutions) is denoted by $\mathrm{PreSol}(\mathcal{E})$ (resp. $\mathrm{PostSol}(\mathcal{E})$). The
least fixpoint (resp. the greatest fixpoint) of an operator $f : \mathbb{D} \to \mathbb{D}$ is denoted by $\mu f$
(resp. $\nu f$), provided that it exists. Thus, the least solution (resp. the greatest solution)
of a system $\mathcal{E}$ of equations is denoted by $\mu[\![\mathcal{E}]\!]$ (resp. $\nu[\![\mathcal{E}]\!]$), provided that it exists.
For a pre-solution $\rho$ (resp. for a post-solution $\rho$), $\mu_{\ge\rho}[\![\mathcal{E}]\!]$ (resp. $\nu_{\le\rho}[\![\mathcal{E}]\!]$) denotes the
least solution that is greater than or equal to $\rho$ (resp. the greatest solution that is less
than or equal to $\rho$). From Knaster-Tarski's fixpoint theorem we get: every system $\mathcal{E}$
of monotone equations over a complete lattice has a least solution $\mu[\![\mathcal{E}]\!]$ and a greatest
solution $\nu[\![\mathcal{E}]\!]$. Furthermore, $\mu[\![\mathcal{E}]\!] = \bigwedge \mathrm{PostSol}(\mathcal{E})$ and $\nu[\![\mathcal{E}]\!] = \bigvee \mathrm{PreSol}(\mathcal{E})$.



Linear Programming We consider linear programming problems (LP problems for
short) of the form $\sup \{c^\top x \mid x \in \mathbb{R}^n, Ax \le b\}$, where $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$
are the inputs. The convex closed polyhedron $\{x \in \mathbb{R}^n \mid Ax \le b\}$ is called the feasible
space. The LP problem is called infeasible iff the feasible space is empty. An element of
the feasible space is called a feasible solution. A feasible solution $x$ that maximizes $c^\top x$
is called an optimal solution.

LP problems can be solved in polynomial time through interior point methods
[32, 43]. Note, however, that the running-time then crucially depends on the sizes
of occurring numbers. At the danger of an exponential running-time in contrived cases,
we can also instead rely on the simplex algorithm: its running-time is uniform, i.e.,
independent of the sizes of occurring numbers (given that arithmetic operations,
comparison, storage and retrieval for numbers are counted for $O(1)$).



SAT modulo real linear arithmetic The set of SAT modulo real linear arithmetic
formulas $\Phi$ is defined through the grammar

\[ e ::= c \mid x \mid e_1 + e_2 \mid c \cdot e' \qquad \Phi ::= a \mid e_1 \le e_2 \mid \Phi_1 \vee \Phi_2 \mid \Phi_1 \wedge \Phi_2 \]

Here, $c \in \mathbb{R}$ is a constant, $x$ is a real-valued variable, $e, e', e_1, e_2$ are real-valued
linear expressions, $a$ is a Boolean variable and $\Phi_1, \Phi_2$ are formulas. An interpretation
$I$ for a formula $\Phi$ is a mapping that assigns a real value
to every real-valued variable and a Boolean value to every Boolean variable. We write
$I \models \Phi$ for "$I$ is a model of $\Phi$", i.e., $[\![c]\!]I = c$, $[\![x]\!]I = I(x)$, $[\![e_1 + e_2]\!]I = [\![e_1]\!]I + [\![e_2]\!]I$,
$[\![c \cdot e']\!]I = c \cdot [\![e']\!]I$, and:

\[
\begin{aligned}
I \models a &\iff I(a) = 1 & I \models e_1 \le e_2 &\iff [\![e_1]\!]I \le [\![e_2]\!]I \\
I \models \Phi_1 \vee \Phi_2 &\iff I \models \Phi_1 \text{ or } I \models \Phi_2 & I \models \Phi_1 \wedge \Phi_2 &\iff I \models \Phi_1 \text{ and } I \models \Phi_2
\end{aligned}
\]

A formula is called satisfiable iff it has a model. The problem of deciding whether or
not a given SAT modulo real linear arithmetic formula is satisfiable is NP-complete.
There nevertheless exist efficient solver implementations for this decision problem [15].

In order to simplify notations we also allow matrices, vectors, the operations $\ge$,
$<$, $>$, $\ne$, $=$, and the Boolean constants 0 and 1 to occur.
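
The model relation above can be turned into a small evaluator. The following is a minimal sketch, assuming a hypothetical tuple encoding of expressions and formulas (the tags `"const"`, `"var"`, `"add"`, `"mul"`, `"bool"`, `"le"`, `"or"`, `"and"` are illustrative names, not notation from this article). It only checks whether a given interpretation is a model; it is not a satisfiability solver.

```python
def eval_expr(e, interp):
    """Value of a linear expression e under the interpretation interp."""
    tag = e[0]
    if tag == "const":
        return e[1]
    if tag == "var":
        return interp[e[1]]
    if tag == "add":
        return eval_expr(e[1], interp) + eval_expr(e[2], interp)
    if tag == "mul":  # constant times expression
        return e[1] * eval_expr(e[2], interp)
    raise ValueError(f"unknown expression tag: {tag}")

def is_model(f, interp):
    """Decide whether interp is a model of the formula f."""
    tag = f[0]
    if tag == "bool":
        return interp[f[1]] == 1
    if tag == "le":
        return eval_expr(f[1], interp) <= eval_expr(f[2], interp)
    if tag == "or":
        return is_model(f[1], interp) or is_model(f[2], interp)
    if tag == "and":
        return is_model(f[1], interp) and is_model(f[2], interp)
    raise ValueError(f"unknown formula tag: {tag}")

# Illustrative formula: (a or x + 2*y <= 3) and 1 <= x
phi = ("and",
       ("or", ("bool", "a"),
              ("le", ("add", ("var", "x"), ("mul", 2, ("var", "y"))), ("const", 3))),
       ("le", ("const", 1), ("var", "x")))

good = {"a": 0, "x": 1, "y": 1}  # x + 2y = 3 <= 3 and 1 <= x
bad = {"a": 0, "x": 1, "y": 2}   # x + 2y = 5 > 3 and a is false
print(is_model(phi, good), is_model(phi, bad))
```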

Collecting and Abstract Semantics The programs that we consider in this article
use real-valued variables $x_1, \ldots, x_n$. Accordingly, we denote by $x = (x_1, \ldots, x_n)^\top$ the
vector of all program variables. For simplicity, we only consider elementary statements
of the form $x := Ax + b$ and $Ax \le b$, where $A \in \mathbb{R}^{n \times n}$ (resp. $\mathbb{R}^{k \times n}$), $b \in \mathbb{R}^n$ (resp.
$\mathbb{R}^k$), and $x \in \mathbb{R}^n$ denotes the vector of all program variables. Statements of the form
$x := Ax + b$ are called (affine) assignments. Statements of the form $Ax \le b$ are called
(affine) guards. Additionally, we allow statements of the form $s_1; \cdots; s_k$ and $s_1 \mid \cdots \mid s_k$,
where $s_1, \ldots, s_k$ are statements. The operator $;$ binds tighter than the operator $\mid$, and
we consider $;$ and $\mid$ to be right-associative, i.e., $s_1 \mid s_2 \mid s_3$ stands for $s_1 \mid (s_2 \mid s_3)$, and
$s_1; s_2; s_3$ stands for $s_1; (s_2; s_3)$. The set of statements is denoted by Stmt. A statement
of the form $s_1 \mid \cdots \mid s_k$, where $s_i$ does not contain the operator $\mid$ for all $i = 1, \ldots, k$, is
called merge-simple. A merge-simple statement $s$ that does not use the $\mid$ operator at all
is called sequential. A statement is called elementary iff it contains neither the operator
$\mid$ nor the operator $;$.

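The collecting semantics $[\![s]\!] : 2^{\mathbb{R}^n} \to 2^{\mathbb{R}^n}$ of a statement $s \in \mathrm{Stmt}$ is defined by

\[
\begin{aligned}
[\![x := Ax + b]\!]X &:= \{Ax + b \mid x \in X\}, & [\![Ax \le b]\!]X &:= \{x \in X \mid Ax \le b\}, \\
[\![s_1; \cdots; s_k]\!] &:= [\![s_k]\!] \circ \cdots \circ [\![s_1]\!], & [\![s_1 \mid \cdots \mid s_k]\!]X &:= [\![s_1]\!]X \cup \cdots \cup [\![s_k]\!]X
\end{aligned}
\]

for $X \subseteq \mathbb{R}^n$. Note that the operators $;$ and $\mid$ are associative, i.e., $[\![(s_1; s_2); s_3]\!] =
[\![s_1; (s_2; s_3)]\!]$ and $[\![(s_1 \mid s_2) \mid s_3]\!] = [\![s_1 \mid (s_2 \mid s_3)]\!]$ hold for all statements $s_1, s_2, s_3$.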
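
The four defining clauses can be executed directly on finite sets of points. The following is a minimal sketch, assuming a hypothetical tuple encoding of statements (`"assign"`, `"guard"`, `"seq"`, `"choice"` are illustrative tags) and using as input the loop body $s = s'; (s_1 \mid s_2)$ of the running example introduced in Example 1. Real collecting semantics ranges over arbitrary subsets of $\mathbb{R}^n$; finite sets are used here only to make the definition tangible.

```python
def matvec(A, x):
    """Matrix-vector product with A given as a list of rows."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) for row in A)

def collect(s, X):
    """Collecting semantics of statement s applied to a finite set X of points."""
    tag = s[0]
    if tag == "assign":            # x := Ax + b
        _, A, b = s
        return {tuple(v + bi for v, bi in zip(matvec(A, x), b)) for x in X}
    if tag == "guard":             # Ax <= b
        _, A, b = s
        return {x for x in X if all(v <= bi for v, bi in zip(matvec(A, x), b))}
    if tag == "seq":               # s1; ...; sk
        for t in s[1:]:
            X = collect(t, X)
        return X
    if tag == "choice":            # s1 | ... | sk
        return set().union(*(collect(t, X) for t in s[1:]))
    raise ValueError(f"unknown statement tag: {tag}")

# s' = x1 <= 1000; x2 := -x1
s_prime = ("seq", ("guard", [[1, 0]], (1000,)),
                  ("assign", [[1, 0], [-1, 0]], (0, 0)))
# s1 = x2 <= -1; x1 := -2*x1
s1 = ("seq", ("guard", [[0, 1]], (-1,)),
             ("assign", [[-2, 0], [0, 1]], (0, 0)))
# s2 = -x2 <= 0; x1 := -x1 + 1
s2 = ("seq", ("guard", [[0, -1]], (0,)),
             ("assign", [[-1, 0], [0, 1]], (1, 0)))
s = ("seq", s_prime, ("choice", s1, s2))

X = {(0, 0), (3, 0), (2000, 0)}
result = collect(s, X)
print(result)
```

The point (2000, 0) is dropped by the guard in $s'$, the point (3, 0) follows $s_1$, and the point (0, 0) follows $s_2$.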

An (affine) program $G$ is a triple $(N, E, \mathrm{st})$, where $N$ is a finite set of program points,
$E \subseteq N \times \mathrm{Stmt} \times N$ is a finite set of control-flow edges, and $\mathrm{st} \in N$ is the start program
point. As usual, the collecting semantics $V$ of a program $G = (N, E, \mathrm{st})$ is the least
solution of the following constraint system:

\[
V[\mathrm{st}] \supseteq \mathbb{R}^n \qquad V[v] \supseteq [\![s]\!](V[u]) \quad \text{for all } (u, s, v) \in E
\]

Here, the variables $V[v]$, $v \in N$, take values in $2^{\mathbb{R}^n}$. The components of the collecting
semantics $V$ are denoted by $V[v]$ for all $v \in N$.

[Figure 1: (a) control-flow graph with the edges $x_2 \le -1$, $x_2 \ge 0$, $x_1 := -2x_1$ and
$x_1 := -x_1 + 1$; (b) control-flow graph with the single loop edge
$x_1 \le 1000;\ x_2 := -x_1;\ (x_2 \le -1;\ x_1 := -2x_1 \mid x_2 \ge 0;\ x_1 := -x_1 + 1)$.]

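Let $\mathbb{D}$ be a complete lattice (for instance the complete lattice of all $n$-dimensional
closed real intervals). Let the partial order of $\mathbb{D}$ be denoted by $\le$. Assume that $\alpha :
2^{\mathbb{R}^n} \to \mathbb{D}$ and $\gamma : \mathbb{D} \to 2^{\mathbb{R}^n}$ form a Galois connection, i.e., for all $X \subseteq \mathbb{R}^n$ and all $d \in \mathbb{D}$,
$\alpha(X) \le d$ iff $X \subseteq \gamma(d)$. The abstract semantics $[\![s]\!]^\sharp : \mathbb{D} \to \mathbb{D}$ of a statement $s$ is defined
by $[\![s]\!]^\sharp := \alpha \circ [\![s]\!] \circ \gamma$. The abstract semantics $V^\sharp$ of an affine program $G = (N, E, \mathrm{st})$ is
the least solution of the following constraint system:

\[
V^\sharp[\mathrm{st}] \ge \alpha(\mathbb{R}^n) \qquad V^\sharp[v] \ge [\![s]\!]^\sharp(V^\sharp[u]) \quad \text{for all } (u, s, v) \in E
\]

Here, the variables $V^\sharp[v]$, $v \in N$, take values in $\mathbb{D}$. The components of the abstract
semantics $V^\sharp$ are denoted by $V^\sharp[v]$ for all $v \in N$. The abstract semantics $V^\sharp$ safely
over-approximates the collecting semantics $V$, i.e., $\gamma(V^\sharp[v]) \supseteq V[v]$ for all $v \in N$.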



Using Cut-Sets to improve Precision Usually, only sequential statements (these 
statements correspond to basic blocks) are allowed in control flow graphs. However, 
given a cut-set C, one can systematically transform any control flow graph G into an 
equivalent control flow graph G' of our form (up to the fact that G' has fewer program 
points than G) with increased precision of the abstract semantics. However, for the 
sake of simplicity, we do not discuss these aspects in detail. Instead, we consider an 
example: 

Example 1 (Using Cut-Sets to improve Precision). As a running example throughout
the present article we use the following C-code:

int x_1, x_2; x_1 = 0; while (x_1 <= 1000) { x_2 = -x_1;
if (x_2 < 0) x_1 = -2 * x_1; else x_1 = -x_1 + 1; }

This C-code is abstracted through the affine program $G_1 = (N_1, E_1, \mathrm{st})$ which is shown
in Figure 1(a). However, it is unnecessary to apply abstraction at every program point;
it suffices to apply abstraction at a cut-set of $G_1$. Since all loops contain program point
1, a cut-set of $G_1$ is $\{1\}$. Equivalent to applying abstraction only at program point 1
is to rewrite the control-flow graph w.r.t. the cut-set $\{1\}$ into a control-flow graph $G$
equivalent w.r.t. the collecting semantics. The result of this transformation is drawn in
Figure 1(b). This means: the affine program for the above C-code is $G = (N, E, \mathrm{st})$,
where $N = \{\mathrm{st}, 1\}$, $E = \{(\mathrm{st}, x_1 := 0, 1), (1, s, 1)\}$, and

\[
\begin{aligned}
s' &= x_1 \le 1000;\ x_2 := -x_1 & s_1 &= x_2 \le -1;\ x_1 := -2x_1 \\
s_2 &= -x_2 \le 0;\ x_1 := -x_1 + 1 & s &= s'; (s_1 \mid s_2)
\end{aligned}
\]

Let $V_1$ denote the collecting semantics of $G_1$ and $V$ denote the collecting semantics of
$G$. $G_1$ and $G$ are equivalent in the following sense: $V[v] = V_1[v]$ holds for all program
points $v \in N$. W.r.t. the abstract semantics, $G$ is, as we will see, strictly more precise
than $G_1$. In general we at least have $V^\sharp[v] \subseteq V_1^\sharp[v]$ for all program points $v \in N$. This
is independent of the abstract domain.¹ □



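Template Linear Constraints In the present article we restrict our considerations
to template linear constraint domains [13]. Assume that we are given a fixed template
constraint matrix $T \in \mathbb{R}^{m \times n}$. The template linear constraint domain is $\overline{\mathbb{R}}^m$. As shown
by Sankaranarayanan et al. [42], the concretization $\gamma : \overline{\mathbb{R}}^m \to 2^{\mathbb{R}^n}$ and the abstraction
$\alpha : 2^{\mathbb{R}^n} \to \overline{\mathbb{R}}^m$, which are defined by

\[
\begin{aligned}
\gamma(d) &:= \{x \in \mathbb{R}^n \mid Tx \le d\} && \forall d \in \overline{\mathbb{R}}^m, \\
\alpha(X) &:= \bigwedge \{d \in \overline{\mathbb{R}}^m \mid \gamma(d) \supseteq X\} && \forall X \subseteq \mathbb{R}^n,
\end{aligned}
\]

form a Galois connection. The template linear constraint domains contain intervals,
zones, and octagons, with appropriate choices of the template constraint matrix [42].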
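
For a finite set $X$ of points, the abstraction defined above reduces to a componentwise maximum: the smallest $d$ with $Tx \le d$ for all $x \in X$ has $d_i = \max_{x \in X} T_{i\cdot} x$, and $\alpha(\emptyset) = -\infty$ in every component. The following is a minimal sketch of $\alpha$ and of membership in $\gamma(d)$ under that finiteness assumption; the function names `alpha` and `in_gamma` and the sample points are illustrative.

```python
import math

def dot(row, x):
    return sum(t * xi for t, xi in zip(row, x))

def alpha(T, X):
    """Abstraction of a finite point set X w.r.t. the template matrix T."""
    return tuple(max((dot(row, x) for x in X), default=-math.inf) for row in T)

def in_gamma(T, d, x):
    """Membership of the point x in the concretization gamma(d)."""
    return all(dot(row, x) <= di for row, di in zip(T, d))

# Interval template for n = 2: rows encode -x1, -x2, x1, x2.
T = [(-1, 0), (0, -1), (1, 0), (0, 1)]
X = {(0, 2), (1, 5)}

d = alpha(T, X)   # encodes the box [0, 1] x [2, 5] as (-l1, -l2, u1, u2)
print(d)
print(in_gamma(T, d, (0.5, 3)), in_gamma(T, d, (2, 3)))
```

The point (0.5, 3) lies in $\gamma(\alpha(X))$ although it is not in $X$, which is the over-approximation introduced by the convex template.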

In a first stage we restrict our considerations to sequential and merge-simple
statements. Even for these statements we avoid unnecessary imprecision, if we abstract such
statements en bloc instead of abstracting each elementary statement separately:

Example 2. In this example we use the interval domain as abstract domain, i.e., our
complete lattice consists of all $n$-dimensional closed real intervals. Our affine program will
use 2 variables, i.e., $n = 2$. The complete lattice of all 2-dimensional closed real intervals
can be specified through the template constraint matrix $T = \binom{-I}{I} \in \mathbb{R}^{4 \times 2}$, where $I$
denotes the identity matrix. Consider the statements $s_1 = x_2 := x_1$, $s_2 = x_1 := x_1 - x_2$,
and $s = s_1; s_2$ and the abstract value $\mathcal{I} = [0, 1] \times \mathbb{R}$ (a 2-dimensional closed real interval).
The interval $\mathcal{I}$ can w.r.t. $T$ be identified with the abstract value $(0, \infty, 1, \infty)^\top$. More
generally, w.r.t. $T$ every 2-dimensional closed real interval $[l_1, u_1] \times [l_2, u_2]$ can be identified
with the abstract value $(-l_1, -l_2, u_1, u_2)^\top$. If we abstract each elementary statement
separately, then we in fact use $[\![s_2]\!]^\sharp \circ [\![s_1]\!]^\sharp$ instead of $[\![s]\!]^\sharp$ to abstract the collecting
semantics $[\![s]\!]$ of the statement $s = s_1; s_2$. The following calculation shows that this can be
important:

\[
[\![s]\!]^\sharp \mathcal{I} = [0, 0] \times [0, 1] \ne [-1, 1] \times [0, 1] = [\![s_2]\!]^\sharp([0, 1] \times [0, 1]) = ([\![s_2]\!]^\sharp \circ [\![s_1]\!]^\sharp)\mathcal{I}
\]

The imprecision is caused by the additional abstraction. We lose the information that
the values of the program variables $x_1$ and $x_2$ are equal after executing the first
statement. □
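
The calculation of Example 2 can be replayed mechanically. The following is a minimal sketch for sequences of affine assignments on boxes only (guards are not handled): the en bloc transformer is obtained by first composing the affine maps and then taking the interval hull of the image, whereas the stepwise transformer takes the hull after every assignment. The helper names `compose` and `image_box` are illustrative.

```python
import math

def compose(A2, b2, A1, b1):
    """Affine map x -> A2 (A1 x + b1) + b2, i.e. first (A1, b1), then (A2, b2)."""
    n = len(A1)
    A = [[sum(A2[i][k] * A1[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    b = [sum(A2[i][k] * b1[k] for k in range(n)) + b2[i] for i in range(n)]
    return A, b

def image_box(A, b, box):
    """Interval hull of {Ax + b | x in box}; box is a list of (lo, hi) pairs."""
    out = []
    for row, bi in zip(A, b):
        lo = hi = bi
        for a, (l, u) in zip(row, box):
            if a > 0:
                lo, hi = lo + a * l, hi + a * u
            elif a < 0:
                lo, hi = lo + a * u, hi + a * l
            # a == 0: the variable does not occur, so unbounded ranges are harmless
        out.append((lo, hi))
    return out

A1, b1 = [[1, 0], [1, 0]], [0, 0]    # s1: x2 := x1
A2, b2 = [[1, -1], [0, 1]], [0, 0]   # s2: x1 := x1 - x2
box = [(0, 1), (-math.inf, math.inf)]  # the abstract value [0, 1] x R

stepwise = image_box(A2, b2, image_box(A1, b1, box))
en_bloc = image_box(*compose(A2, b2, A1, b1), box)
print(stepwise, en_bloc)
```

The composed map sends $x_1$ to $x_1 - x_1 = 0$, which is exactly the equality between $x_1$ and $x_2$ that the intermediate box forgets.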



¹ We assume that we are given a Galois connection and thus in particular monotone best
abstract transformers.

Another possibility for avoiding unnecessary imprecision in the above example would
consist in adding additional rows to the template constraint matrix. Although this
works for the above example, it does not work in general, since still only convex sets
can be described, but sometimes non-convex sets are required (cf. the example in
the introduction).

Provided that $s$ is a merge-simple statement, $[\![s]\!]^\sharp d$ can be computed in polynomial
time through linear programming:

Lemma 3 (Merge-Simple Statements). Let $s$ be a merge-simple statement and $d \in \overline{\mathbb{R}}^m$.
Then $[\![s]\!]^\sharp d$ can be computed in polynomial time through linear programming. □

However, the situation for arbitrary statements is significantly more difficult, since, by
reducing SAT to the corresponding decision problem, we can show the following:

Lemma 4. The problem of deciding whether or not, for a given template constraint
matrix $T$ and a given statement $s$, $[\![s]\!]^\sharp \infty > -\infty$ holds, is NP-complete.

Before proving the above lemma, we introduce $\vee$-strategies for statements as follows:

Definition 1 ($\vee$-Strategies for Statements). A $\vee$-strategy $\sigma$ for a statement $s$ is a
function that maps every position of a $\mid$-statement (a statement of the form $s_0 \mid s_1$)
within $s$ to 0 or 1. The application $s\sigma$ of a $\vee$-strategy $\sigma$ to a statement $s$ is inductively
defined by $s\sigma = s$, $(s_0 \mid s_1)\sigma = s_{\sigma(\mathrm{pos}(s_0 \mid s_1))}\sigma$, and $(s_0; s_1)\sigma = (s_0\sigma; s_1\sigma)$, where $s$ is
an elementary statement, and $s_0, s_1$ are arbitrary statements. For all occurrences $s'$,
$\mathrm{pos}(s')$ denotes the position of $s'$, i.e., $\mathrm{pos}(s')$ identifies the occurrence. □

Proof. Firstly, we show containment in NP. Assume $[\![s]\!]^\sharp \infty > -\infty$. There exists some
$k$ such that the $k$-th component of $[\![s]\!]^\sharp \infty$ is greater than $-\infty$. We choose $k$
non-deterministically. There exists a $\vee$-strategy $\sigma$ for $s$ such that the $k$-th component
of $[\![s\sigma]\!]^\sharp \infty$ equals the $k$-th component of $[\![s]\!]^\sharp \infty$. We choose such a $\vee$-strategy
non-deterministically. By Lemma 3 we can check in polynomial time whether the $k$-th
component of $[\![s\sigma]\!]^\sharp \infty$ is greater than $-\infty$. If this is fulfilled, we accept.

In order to show NP-hardness, we reduce the NP-hard problem SAT to our problem.
Let $\Phi$ be a propositional formula with $n$ variables. W.l.o.g. we assume that $\Phi$ is in
normal form, i.e., there are no negated sub-formulas that contain $\wedge$ or $\vee$. We define
the statement $s(\Phi)$ that uses the variables of $\Phi$ as program variables inductively by
$s(z) := z = 1$, $s(\neg z) := z = 0$, $s(\Phi_1 \wedge \Phi_2) := s(\Phi_1); s(\Phi_2)$, and $s(\Phi_1 \vee \Phi_2) := s(\Phi_1) \mid s(\Phi_2)$,
where $z$ is a variable of $\Phi$, and $\Phi_1, \Phi_2$ are formulas. Here, the statement $Ax = b$ is an
abbreviation for the statement $Ax \le b; -Ax \le -b$. The formula $\Phi$ is satisfiable iff
$[\![s(\Phi)]\!]\mathbb{R}^n \ne \emptyset$ holds. Moreover, even if we just use the interval domain, $[\![s(\Phi)]\!]\mathbb{R}^n \ne \emptyset$
holds iff $[\![s(\Phi)]\!]^\sharp \infty > -\infty$ holds. Thus, $\Phi$ is satisfiable iff $[\![s(\Phi)]\!]^\sharp \infty > -\infty$ holds. □
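
The reduction used in the hardness part can be made concrete. The following is a minimal sketch, assuming a hypothetical tuple encoding of formulas in negation normal form (`"var"`, `"not"`, `"and"`, `"or"`) and of the resulting statements (`"eq"`, `"seq"`, `"choice"`). It builds $s(\Phi)$ and decides $[\![s(\Phi)]\!]\mathbb{R}^n \ne \emptyset$ by brute-force enumeration of all paths, which takes exponential time; it illustrates the construction and is not an efficient decision procedure.

```python
def to_statement(f):
    """Translate a formula in negation normal form into the statement s(f)."""
    tag = f[0]
    if tag == "var":   # s(z) := z = 1
        return ("eq", f[1], 1)
    if tag == "not":   # s(not z) := z = 0
        return ("eq", f[1], 0)
    if tag == "and":   # conjunction becomes sequencing
        return ("seq", to_statement(f[1]), to_statement(f[2]))
    if tag == "or":    # disjunction becomes choice
        return ("choice", to_statement(f[1]), to_statement(f[2]))
    raise ValueError(f"unknown formula tag: {tag}")

def paths(s):
    """Yield every path through s as a list of guards (variable, value)."""
    tag = s[0]
    if tag == "eq":
        yield [(s[1], s[2])]
    elif tag == "seq":
        for p in paths(s[1]):
            for q in paths(s[2]):
                yield p + q
    else:  # choice
        yield from paths(s[1])
        yield from paths(s[2])

def feasible(path):
    """A path is feasible iff no variable is required to be both 0 and 1."""
    seen = {}
    return all(seen.setdefault(z, v) == v for z, v in path)

def reachable(s):
    """True iff the collecting semantics of s applied to R^n is non-empty."""
    return any(feasible(p) for p in paths(s))

sat = ("and", ("or", ("var", "z1"), ("var", "z2")), ("not", "z1"))
unsat = ("and", ("var", "z1"), ("not", "z1"))
print(reachable(to_statement(sat)), reachable(to_statement(unsat)))
```

Each choice of branches in `paths` corresponds to one $\vee$-strategy in the sense of Definition 1.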

Obviously, $[\![(s_1 \mid s_2); s]\!] = [\![s_1; s \mid s_2; s]\!]$ and $[\![s; (s_1 \mid s_2)]\!] = [\![s; s_1 \mid s; s_2]\!]$ for all
statements $s, s_1, s_2$. We can transform any statement $s$ into an equivalent merge-simple
statement $s'$ using these rules. We denote the merge-simple statement $s'$ that is
obtained from an arbitrary statement $s$ by applying the above rules in some canonical way
by $[s]$. Intuitively, $[s]$ is an explicit enumeration of all paths through the statement $s$.

Lemma 5. For every statement $s$, $[s]$ is merge-simple, and $[\![s]\!] = [\![[s]]\!]$. The size of $[s]$
is at most exponential in the size of $s$. □

However, in the worst case, the size of $[s]$ is exponential in the size of $s$. For the statement
$s = (s_1^{(1)} \mid s_1^{(2)}); \cdots; (s_k^{(1)} \mid s_k^{(2)})$, for instance, we get
$[s] = \big|_{(a_1, \ldots, a_k) \in \{1, 2\}^k}\ s_1^{(a_1)}; \cdots; s_k^{(a_k)}$.
After replacing all statements $s$ with $[s]$ it is in principle possible to use the methods
of Gawlitza and Seidl [17] in order to compute the abstract semantics precisely.
Because of the exponential blowup, however, this method would be impractical in most
cases.²
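
The blowup can be observed by carrying out the expansion explicitly. The following is a minimal sketch, assuming a hypothetical encoding in which elementary statements are strings and compound statements are tuples tagged `"seq"` or `"choice"`; `expand` returns the list of sequential paths whose merge is $[s]$.

```python
from itertools import product

def expand(s):
    """Enumerate all paths through s; each path is a tuple of elementary statements."""
    if isinstance(s, str):          # elementary statement
        return [(s,)]
    tag, *parts = s
    if tag == "choice":             # union of the paths of the alternatives
        return [p for part in parts for p in expand(part)]
    if tag == "seq":                # every combination of sub-paths, concatenated
        return [sum(combo, ()) for combo in product(*(expand(part) for part in parts))]
    raise ValueError(f"unknown statement tag: {tag}")

# Three consecutive binary choices, as in the statement discussed above with k = 3.
s = ("seq", ("choice", "a1", "a2"), ("choice", "b1", "b2"), ("choice", "c1", "c2"))
all_paths = expand(s)
print(len(all_paths))   # 2**3 paths
print(all_paths[0])
```

With $k$ consecutive binary choices the list has $2^k$ entries, which is the explicit representation an implicit, solver-guided approach is meant to avoid.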

Our new method avoids this exponential blowup: instead
of enumerating all program paths, we shall visit them only as needed. Guided by
a SAT modulo real linear arithmetic solver, our method selects a path through $s$ only
when it is locally profitable in some sense. In the worst case, an exponential number of
paths may be visited (Section 7); but one can hope that this does not happen in many
practical cases, in the same way that SAT and SMT solving perform well on many
practical cases even though they in principle may visit an exponential number of cases.



Abstract Semantic Equations The first step of our method consists of rewriting
our program analysis problem into a system of abstract semantic equations that is
interpreted over the reals. For that, let $G = (N, E, \mathrm{st})$ be an affine program and $V^\sharp$ its
abstract semantics. We define the system $\mathcal{C}(G)$ of abstract semantic inequalities to be
the smallest set of inequalities that fulfills the following constraints:

• $\mathcal{C}(G)$ contains the inequality $\mathbf{x}_{\mathrm{st},i} \ge \alpha_{i\cdot}(\mathbb{R}^n)$ for every $i \in \{1, \ldots, m\}$.

• $\mathcal{C}(G)$ contains the inequality $\mathbf{x}_{v,i} \ge [\![s]\!]^\sharp_{i\cdot}(\mathbf{x}_{u,1}, \ldots, \mathbf{x}_{u,m})$ for every control-flow edge
$(u, s, v) \in E$ and every $i \in \{1, \ldots, m\}$.

We define the system $\mathcal{E}(G)$ of abstract semantic equations by $\mathcal{E}(G) := \mathcal{E}(\mathcal{C}(G))$. Here, for
a system $\mathcal{C}' = \{\mathbf{x}_1 \ge e_{1,1}, \ldots, \mathbf{x}_1 \ge e_{1,k_1}, \ldots, \mathbf{x}_n \ge e_{n,1}, \ldots, \mathbf{x}_n \ge e_{n,k_n}\}$ of inequalities,
$\mathcal{E}(\mathcal{C}')$ is the system $\mathcal{E}(\mathcal{C}') = \{\mathbf{x}_1 = e_{1,1} \vee \cdots \vee e_{1,k_1}, \ldots, \mathbf{x}_n = e_{n,1} \vee \cdots \vee e_{n,k_n}\}$ of equations.
The system $\mathcal{E}(G)$ of abstract semantic equations captures the abstract semantics $V^\sharp$ of
$G$:

Lemma 6. $(V^\sharp[v])_{i\cdot} = \mu[\![\mathcal{E}(G)]\!](\mathbf{x}_{v,i})$ for all program points $v$ and all $i \in \{1, \ldots, m\}$. □

Example 7 (Abstract Semantic Equations). We again consider the program $G$ of
Example 1. Assume that the template constraint matrix $T \in \mathbb{R}^{2 \times 2}$ is given by $T_{1\cdot} = (1, 0)$ and
$T_{2\cdot} = (-1, 0)$. Let $V^\sharp$ denote the abstract semantics of $G$. Then $V^\sharp[1] = (2001, 2000)^\top$.
$\mathcal{E}(G)$ consists of the following abstract semantic equations:

\[
\begin{aligned}
\mathbf{x}_{\mathrm{st},1} &= \infty & \mathbf{x}_{1,1} &= [\![x_1 := 0]\!]^\sharp_{1\cdot}(\mathbf{x}_{\mathrm{st},1}, \mathbf{x}_{\mathrm{st},2}) \vee [\![s]\!]^\sharp_{1\cdot}(\mathbf{x}_{1,1}, \mathbf{x}_{1,2}) \\
\mathbf{x}_{\mathrm{st},2} &= \infty & \mathbf{x}_{1,2} &= [\![x_1 := 0]\!]^\sharp_{2\cdot}(\mathbf{x}_{\mathrm{st},1}, \mathbf{x}_{\mathrm{st},2}) \vee [\![s]\!]^\sharp_{2\cdot}(\mathbf{x}_{1,1}, \mathbf{x}_{1,2})
\end{aligned}
\]

As stated by Lemma 6 we have $(V^\sharp[1])_{1\cdot} = \mu[\![\mathcal{E}(G)]\!](\mathbf{x}_{1,1}) = 2001$, and $(V^\sharp[1])_{2\cdot} =
\mu[\![\mathcal{E}(G)]\!](\mathbf{x}_{1,2}) = 2000$. □

² Note that we cannot expect a polynomial-time algorithm: because of Lemma 4, even without
loops, abstract reachability is NP-hard. Even if all statements are merge-simple, we cannot expect
a polynomial-time algorithm, since the problem of computing the winning regions of parity games is
polynomial-time reducible to abstract reachability.
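
The value $V^\sharp[1] = (2001, 2000)^\top$ can be cross-checked numerically. The following is a minimal sketch that uses plain Kleene iteration, which is not the strategy improvement method of this article and is not guaranteed to terminate in general; it happens to stabilize on this example because the guard $x_1 \le 1000$ bounds the growth. Since the template only constrains $x_1$ and the body overwrites $x_2$, an abstract value is represented as an interval `(lo, hi)` for $x_1$, and `body` evaluates both paths of $s$ en bloc.

```python
def join(a, b):
    """Least upper bound of two intervals (lo, hi); None is the empty interval."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def body(iv):
    """Abstract effect of s = s'; (s1 | s2) on the interval iv of x1."""
    if iv is None:
        return None
    lo, hi = iv
    out = None
    # Path s'; s1: x1 <= 1000 and x2 = -x1 <= -1, i.e. 1 <= x1 <= 1000; x1 := -2*x1.
    l, h = max(lo, 1), min(hi, 1000)
    if l <= h:
        out = join(out, (-2 * h, -2 * l))
    # Path s'; s2: x1 <= 1000 and -x2 = x1 <= 0; x1 := -x1 + 1.
    l, h = lo, min(hi, 0)
    if l <= h:
        out = join(out, (-h + 1, -l + 1))
    return out

def kleene():
    """Iterate inv := [0, 0] joined with body(inv), starting from the empty interval."""
    inv = None
    while True:
        nxt = join((0, 0), body(inv))   # (0, 0) is the effect of the edge x1 := 0
        if nxt == inv:
            return inv
        inv = nxt

lo, hi = kleene()
d = (hi, -lo)   # template rows (1, 0) and (-1, 0) bound x1 and -x1
print(d)
```

The resulting interval is $x_1 \in [-2000, 2001]$, which corresponds to the abstract value stated in the example.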



3 A Lower Bound on the Complexity 

In this section we show that the problem of computing abstract semantics of affine 
programs w.r.t. the interval domain is Ilg-hard. Hg-hard problems are conjectured to be 
harder than both NP-complete and co-NP-complete problems. For further information 



regarding the polynomial-time hierarchy see e.g. Stockmeyer 44 ]. 



Theorem 8. The problem of deciding, whether, for a given program G, a given template 
constraint matrix T , and a given program point v, V^[v] > — oo holds, is I\^-hard. 

Proof. We reduce the $\Pi^p_2$-complete problem of deciding the truth of a $\forall\exists$ propositional
formula [?] to our problem. Let $\Phi = \forall x_1, \dots, x_n \,.\, \exists y_1, \dots, y_m \,.\, \Phi'$ be a formula without
free variables, where $\Phi'$ is a propositional formula. We consider the affine program $G =
(N, E, st)$, where $N = \{st, 1, 2\}$, and

$E = \{(st, x := 0, 1), (1, s, 1), (1, x \ge 2^n, 2)\}$ with

$s = x' := x$;
$(x' \ge 2^{n-1}; x' := x' - 2^{n-1}; x_n := 1 \mid x' \le 2^{n-1} - 1; x_n := 0)$; $\dots$;
$(x' \ge 2^{0}; x' := x' - 2^{0}; x_1 := 1 \mid x' \le 2^{0} - 1; x_1 := 0)$;
$s(\Phi')$; $x := x + 1$

The statement $s(\Phi')$ is defined as in the proof of Lemma 3.

In intuitive terms: this program initializes $x$ to 0. Then, it enters a loop: it computes
into $x_1, \dots, x_n$ the binary decomposition of $x$, then it attempts to nondeterministically
choose $y_1, \dots, y_m$ so that $\Phi'$ is true. If this is possible, it increments $x$ by one and loops.
Otherwise, it just loops. Thus, there is a terminating computation iff $\Phi$ holds.

Then $\Phi$ holds iff $V[2] \neq \emptyset$. For the abstraction, we consider the interval domain. By
considering the Kleene iteration, it is easy to see that $V[2] \neq \emptyset$ holds iff $V^\sharp[2] > -\infty$
holds. Thus $\Phi$ holds iff $V^\sharp[2] > -\infty$ holds. □
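The intuition behind the reduction can be replayed concretely. The following sketch is an illustration under stated assumptions: the function names are hypothetical, the propositional formula $\Phi'$ is passed as a Python predicate `phi(xs, ys)`, and the nondeterministic choice of $y_1, \dots, y_m$ is modelled by exhaustive search. It runs the concrete program, not the abstract interpretation.

```python
from itertools import product

def holds_forall_exists(n, m, phi):
    """Brute-force truth value of: forall x_1..x_n exists y_1..y_m . phi(xs, ys)."""
    return all(any(phi(xs, ys) for ys in product((0, 1), repeat=m))
               for xs in product((0, 1), repeat=n))

def reaches_point_2(n, m, phi):
    """Concrete behaviour of the program from the proof of Theorem 8."""
    x = 0
    while x < 2 ** n:                      # the edge (1, x >= 2^n, 2) is not yet enabled
        rest, bits = x, [0] * n
        for i in range(n - 1, -1, -1):     # binary decomposition, highest bit first
            if rest >= 2 ** i:
                rest -= 2 ** i
                bits[i] = 1
        # s(phi'): try every choice of y_1..y_m
        if not any(phi(tuple(bits), ys) for ys in product((0, 1), repeat=m)):
            return False                   # no run gets past this value of x
        x += 1
    return True

always = lambda xs, ys: xs[0] != ys[0]     # true for every x (choose y_1 = not x_1)
never_for_zero = lambda xs, ys: bool(xs[0] and ys[0])   # fails when x_1 = 0
print(reaches_point_2(2, 1, always), reaches_point_2(2, 1, never_for_zero))
```

Because the loop enumerates $x = 0, \dots, 2^n - 1$, every assignment of $x_1, \dots, x_n$ is visited exactly once, which is why reaching program point 2 coincides with the truth of the $\forall\exists$ formula.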



4 Determining Improved Strategies 

In this section we develop a method for computing local improvements of strategies 
through solving SAT modulo real linear arithmetic formulas. 

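In order to decide whether or not, for a given statement $s$, a given $j \in \{1, \dots, m\}$,
a given $c$, and a given $d \in \overline{\mathbb{R}}^m$, $[\![s]\!]^\sharp_{j\cdot} d > c$ holds, we construct the following SAT modulo
real linear arithmetic formula (we use existential quantifiers to improve readability):

$\Phi(s, d, j, c) :\equiv \exists v \in \mathbb{R} \,.\, \Phi(s, d, j) \wedge v > c$
$\Phi(s, d, j) :\equiv \exists x \in \mathbb{R}^n, x' \in \mathbb{R}^n \,.\, Tx \le d \wedge \Phi(s) \wedge v = T_{j\cdot} x'$

Here, $\Phi(s)$ is a formula that relates every $x \in \mathbb{R}^n$ with all elements from the set $[\![s]\!]\{x\}$.
It is defined inductively over the structure of $s$ as follows:

$\Phi(x := Ax + b) :\equiv x' = Ax + b$
$\Phi(Ax \le b) :\equiv Ax \le b \wedge x' = x$
$\Phi(s_1; s_2) :\equiv \exists x'' \in \mathbb{R}^n \,.\, \Phi(s_1)[x''/x'] \wedge \Phi(s_2)[x''/x]$
$\Phi(s_1 \mid s_2) :\equiv (\overline{a_{\mathrm{Pos}(s_1 \mid s_2)}} \wedge \Phi(s_1)) \vee (a_{\mathrm{Pos}(s_1 \mid s_2)} \wedge \Phi(s_2))$

Here, for every position $p$ of a subexpression of $s$, $a_p$ is a Boolean variable. Let $\mathrm{Pos}_\mid(s)$
denote the set of all positions of $\mid$-subexpressions of $s$. The set of free variables of the
formula $\Phi(s)$ is $\{x, x'\} \cup \{a_p \mid p \in \mathrm{Pos}_\mid(s)\}$. A valuation for the variables from the set
$\{a_p \mid p \in \mathrm{Pos}_\mid(s)\}$ describes a path through $s$. We have:

Lemma 9. $[\![s]\!]^\sharp_{j\cdot} d > c$ holds iff $\Phi(s, d, j, c)$ is satisfiable. □

Our next goal is to compute a $\vee$-strategy $\sigma$ for $s$ such that $[\![s\sigma]\!]^\sharp_{j\cdot} d > c$ holds, provided
that $[\![s]\!]^\sharp_{j\cdot} d > c$ holds. Let $s$ be a statement, $d \in \overline{\mathbb{R}}^m$, $j \in \{1, \dots, m\}$, and $c \in \mathbb{R}$. Assume
that $[\![s]\!]^\sharp_{j\cdot} d > c$ holds. By Lemma 9, there exists a model $M$ of $\Phi(s, d, j, c)$. We define
the $\vee$-strategy $\sigma_M$ for $s$ by $\sigma_M(p) := M(a_p)$ for all $p \in \mathrm{Pos}_\mid(s)$. By again applying
Lemma 9, we get $[\![s\sigma_M]\!]^\sharp_{j\cdot} d > c$. Summarizing we have:

Lemma 10. By solving the SAT modulo real linear arithmetic formula $\Phi(s, d, j, c)$,
which can be obtained from $s$ in linear time, we can decide whether or not $[\![s]\!]^\sharp_{j\cdot} d > c$
holds. From a model $M$ of this formula, we can obtain a $\vee$-strategy $\sigma_M$ for $s$ such that
$[\![s\sigma_M]\!]^\sharp_{j\cdot} d > c$ holds in linear time. □

Example 11. We again continue Examples 1 and 7. We want to know whether
$[\![s]\!]^\sharp_{1\cdot}(0, 0)^\top > 0$ holds. For that we compute a model of the formula $\Phi(s, (0, 0)^\top, 1, 0)$,
which is written down in Figure 2. $M = \{a_1 \mapsto 1\}$ is a model of the formula
$\Phi(s, (0, 0)^\top, 1, 0)$. Thus, we have $0 < [\![s\sigma_M]\!]^\sharp_{1\cdot}(0, 0)^\top = [\![s'; s_2]\!]^\sharp_{1\cdot}(0, 0)^\top$ by Lemma 10. □

$\Phi(s, (0,0)^\top, 1, 0) \equiv \exists v \in \mathbb{R} \,.\, \Phi(s, (0,0)^\top, 1) \wedge v > 0$

$\Phi(s, (0,0)^\top, 1) \equiv \exists x \in \mathbb{R}^2, x' \in \mathbb{R}^2 \,.\, x_{1\cdot} \le 0 \wedge -x_{1\cdot} \le 0 \wedge \Phi(s) \wedge v = x'_{1\cdot}$

$\Phi(s') \equiv \exists x'' \in \mathbb{R}^2 \,.\, x_{1\cdot} \le 1000 \wedge x''_{1\cdot} = x_{1\cdot} \wedge x''_{2\cdot} = x_{2\cdot} \wedge x'_{1\cdot} = x''_{1\cdot} \wedge x'_{2\cdot} = -x''_{1\cdot}$
$\phantom{\Phi(s')} \equiv x_{1\cdot} \le 1000 \wedge x'_{1\cdot} = x_{1\cdot} \wedge x'_{2\cdot} = -x_{1\cdot}$

$\Phi(s_1) \equiv \exists x'' \in \mathbb{R}^2 \,.\, x_{2\cdot} \le -1 \wedge x''_{1\cdot} = x_{1\cdot} \wedge x''_{2\cdot} = x_{2\cdot} \wedge x'_{1\cdot} = -2x''_{1\cdot} \wedge x'_{2\cdot} = x''_{2\cdot}$
$\phantom{\Phi(s_1)} \equiv x_{2\cdot} \le -1 \wedge x'_{1\cdot} = -2x_{1\cdot} \wedge x'_{2\cdot} = x_{2\cdot}$

$\Phi(s_2) \equiv \exists x'' \in \mathbb{R}^2 \,.\, -x_{2\cdot} \le 0 \wedge x''_{1\cdot} = x_{1\cdot} \wedge x''_{2\cdot} = x_{2\cdot} \wedge x'_{1\cdot} = -x''_{1\cdot} + 1 \wedge x'_{2\cdot} = x''_{2\cdot}$
$\phantom{\Phi(s_2)} \equiv -x_{2\cdot} \le 0 \wedge x'_{1\cdot} = -x_{1\cdot} + 1 \wedge x'_{2\cdot} = x_{2\cdot}$

$\Phi(s_1 \mid s_2) \equiv (\overline{a_1} \wedge \Phi(s_1)) \vee (a_1 \wedge \Phi(s_2))$
$\phantom{\Phi(s_1 \mid s_2)} \equiv (\overline{a_1} \wedge x_{2\cdot} \le -1 \wedge x'_{1\cdot} = -2x_{1\cdot} \wedge x'_{2\cdot} = x_{2\cdot}) \vee (a_1 \wedge -x_{2\cdot} \le 0 \wedge x'_{1\cdot} = -x_{1\cdot} + 1 \wedge x'_{2\cdot} = x_{2\cdot})$

$\Phi(s) \equiv \exists x'' \in \mathbb{R}^2 \,.\, \Phi(s')[x''/x'] \wedge \Phi(s_1 \mid s_2)[x''/x]$
$\phantom{\Phi(s)} \equiv x_{1\cdot} \le 1000 \wedge ((\overline{a_1} \wedge -x_{1\cdot} \le -1 \wedge x'_{1\cdot} = -2x_{1\cdot} \wedge x'_{2\cdot} = -x_{1\cdot}) \vee (a_1 \wedge x_{1\cdot} \le 0 \wedge x'_{1\cdot} = -x_{1\cdot} + 1 \wedge x'_{2\cdot} = -x_{1\cdot}))$

Figure 2: Formula for Example 11

It remains to compute a model of $\Phi(s, d, j, c)$. Most of the state-of-the-art SMT solvers,
as for instance Yices [15], support the computation of models directly; if unsupported,
one can compute the model using standard self-reduction techniques.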
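To make the role of the Boolean variables $a_p$ concrete, the following sketch searches for a model of the formula of Example 11 by brute force instead of calling an SMT solver. Everything about the encoding is an assumption made for illustration: statements are nested tuples, `find_model` is a hypothetical helper, and the search over real-valued $x$ is replaced by a single concrete state. That shortcut is sound only for this example, because $d = (0, 0)^\top$ forces $x_1 = 0$ and $s'$ overwrites $x_2$.

```python
from itertools import product

# Hypothetical statement encoding:
#   ("assign", f)         x := f(x)
#   ("guard", g)          proceed only if g(x) holds
#   ("seq", s, t)         s; t
#   ("choice", p, s, t)   s | t at position p (a_p = 0 selects s, a_p = 1 selects t)

def positions(s):
    """Positions of all |-subexpressions of s."""
    if s[0] == "seq":
        return positions(s[1]) + positions(s[2])
    if s[0] == "choice":
        return [s[1]] + positions(s[2]) + positions(s[3])
    return []

def run(s, x, a):
    """Execute s from state x along the path chosen by valuation a; None if a guard fails."""
    if x is None:
        return None
    if s[0] == "assign":
        return s[1](x)
    if s[0] == "guard":
        return x if s[1](x) else None
    if s[0] == "seq":
        return run(s[2], run(s[1], x, a), a)
    return run(s[3] if a[s[1]] else s[2], x, a)

def find_model(s, x, row, c):
    """Return a valuation of the a_p with row . x' > c, or None if there is none."""
    pos = positions(s)
    for bits in product((0, 1), repeat=len(pos)):
        a = dict(zip(pos, bits))
        out = run(s, x, a)
        if out is not None and sum(t * v for t, v in zip(row, out)) > c:
            return a
    return None

# The statement s = s'; (s1 | s2) of the running example.
s_prime = ("seq", ("guard", lambda x: x[0] <= 1000), ("assign", lambda x: (x[0], -x[0])))
s1 = ("seq", ("guard", lambda x: x[1] <= -1), ("assign", lambda x: (-2 * x[0], x[1])))
s2 = ("seq", ("guard", lambda x: -x[1] <= 0), ("assign", lambda x: (-x[0] + 1, x[1])))
s = ("seq", s_prime, ("choice", 1, s1, s2))

print(find_model(s, (0, 0), (1, 0), 0))    # template row T_1 = (1, 0), bound c = 0
```

The returned valuation plays the role of the model $M$: reading it as a $\vee$-strategy selects the branch $s_2$, in line with Example 11. A real implementation would hand the formula to an SMT solver rather than enumerate valuations.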

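The semantic equations we are concerned with in the present article have the form
$\mathbf{x} = e_1 \vee \dots \vee e_k$, where each expression $e_i$, $i = 1, \dots, k$, is either a constant or an
expression of the form $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$. We now extend our notion of $\vee$-strategies in
order to deal with the occurring right-hand sides:

Definition 2 ($\vee$-Strategies). The $\vee$-strategy for all constants is the 0-tuple $()$. The
application $c()$ of $()$ to a constant $c \in \overline{\mathbb{R}}$ is defined by $c() := c$ for all $c \in \overline{\mathbb{R}}$. A
$\vee$-strategy $\sigma$ for an expression $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$ is a $\vee$-strategy for $s$. The
application $([\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m))\sigma$ of $\sigma$ to $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$ is defined by
$([\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m))\sigma := [\![s\sigma]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$. A $\vee$-strategy for an expression $e = e_0 \vee e_1$, where, for each
$i \in \{0, 1\}$, $e_i$ is either a constant or an expression of the form $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$,
is a pair $(p, \sigma)$, where $p \in \{0, 1\}$ and $\sigma$ is a $\vee$-strategy for $e_p$. The application
$e(p, \sigma)$ of $(p, \sigma)$ to $e = e_0 \vee e_1$ is defined by $e(p, \sigma) := e_p \sigma$. A $\vee$-strategy $\sigma$ for a
system $\mathcal{E} = \{\mathbf{x}_1 = e_1, \dots, \mathbf{x}_n = e_n\}$ of abstract semantic equations is a mapping
$\{\mathbf{x}_i \mapsto \sigma_i \mid i = 1, \dots, n\}$, where $\sigma_i$ is a $\vee$-strategy for $e_i$ for all $i = 1, \dots, n$. We
set $\mathcal{E}(\sigma) := \{\mathbf{x}_1 = e_1(\sigma(\mathbf{x}_1)), \dots, \mathbf{x}_n = e_n(\sigma(\mathbf{x}_n))\}$. □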

Using the same ideas as above, we can prove the following lemma, which finally enables
us to use a SAT modulo real linear arithmetic solver for improving $\vee$-strategies for
systems of abstract semantic equations locally.

Lemma 12. Let $\mathbf{x} = e$ be an abstract semantic equation, $\rho$ a variable assignment, and
$c \in \mathbb{R}$. By solving a SAT modulo real linear arithmetic formula that can be obtained
from $e$, $\rho$ and $c$ in linear time, we can decide whether or not $[\![e]\!]\rho > c$ holds. From a
model $M$ of this formula, we can in linear time obtain a $\vee$-strategy $\sigma_M$ for $e$ such that
$[\![e\sigma_M]\!]\rho > c$ holds. □



5 Solving Systems of Concave Equations 



In order to solve systems of abstract semantic equations (see the end of Section 2) we
generalize the $\vee$-strategy improvement algorithm of Gawlitza and Seidl [21] as follows:



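Concave Functions A set $X \subseteq \mathbb{R}^n$ is called convex iff $\lambda x + (1 - \lambda)y \in X$ holds for
all $x, y \in X$ and all $\lambda \in [0, 1]$. A mapping $f : X \to \mathbb{R}^m$ with $X \subseteq \mathbb{R}^n$ convex is called
convex (resp. concave) iff $f(\lambda x + (1 - \lambda)y) \le$ (resp. $\ge$) $\lambda f(x) + (1 - \lambda)f(y)$ holds for all
$x, y \in X$ and all $\lambda \in [0, 1]$. Note that $f$ is concave iff $-f$ is convex. Note also that $f$ is
convex (resp. concave) iff $f_{i\cdot}$ is convex (resp. concave) for all $i = 1, \dots, m$.

We extend the notion of convexity/concavity from $\mathbb{R}^n \to \mathbb{R}^m$ to $\overline{\mathbb{R}}^n \to \overline{\mathbb{R}}^m$ as follows:

Let $f : \overline{\mathbb{R}}^n \to \overline{\mathbb{R}}^m$, and $I : \{1, \dots, n\} \to \{-\infty, \mathrm{id}, \infty\}$. Here, $-\infty$ denotes the function
that assigns $-\infty$ to every argument, $\mathrm{id}$ denotes the identity function, and $\infty$ denotes
the function that assigns $\infty$ to every argument. We define the mapping $f^{(I)} : \mathbb{R}^n \to \overline{\mathbb{R}}^m$
by $f^{(I)}(x_1, \dots, x_n) := f(I(1)(x_1), \dots, I(n)(x_n))$ for all $x_1, \dots, x_n \in \mathbb{R}$. A mapping
$f : \overline{\mathbb{R}}^n \to \overline{\mathbb{R}}^m$ is called concave iff $f_{i\cdot}$ is continuous on $\{x \in \overline{\mathbb{R}}^n \mid f_{i\cdot}(x) > -\infty\}$ for
all $i \in \{1, \dots, m\}$, and the following conditions are fulfilled for all $I : \{1, \dots, n\} \to
\{-\infty, \mathrm{id}, \infty\}$:

1. $\mathrm{fdom}(f^{(I)})$ is convex.

2. $f^{(I)}|_{\mathrm{fdom}(f^{(I)})}$ is concave.

3. For all $i \in \{1, \dots, m\}$ the following holds: If there exists some $y \in \mathbb{R}^n$ such that
$f^{(I)}_{i\cdot}(y) \in \mathbb{R}$, then $f^{(I)}_{i\cdot}(x) < \infty$ for all $x \in \mathbb{R}^n$.

A mapping $f : \overline{\mathbb{R}}^n \to \overline{\mathbb{R}}^m$ is called convex iff $-f$ is concave. In the following we are only
concerned with mappings $f : \overline{\mathbb{R}}^n \to \overline{\mathbb{R}}^m$ that are monotone and concave.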

We slightly extend the definition of concave equations of Gawlitza and Seidl [21]:

Definition 3 (Concave Equations). An expression $e$ (resp. equation $\mathbf{x} = e$) over $\overline{\mathbb{R}}$ is
called basic concave expression (resp. basic concave equation) iff $[\![e]\!]$ is monotone and
concave. An expression $e$ (resp. equation $\mathbf{x} = e$) over $\overline{\mathbb{R}}$ is called concave iff $e = \bigvee E$,
where $E$ is a set of basic concave expressions. □

The class of systems of concave equations strictly subsumes the class of systems of
rational equations and even the class of systems of rational LP-equations as defined by
Gawlitza and Seidl [17, 24] (cf. [21]).

For this paper it is important to observe that every system of abstract semantic
equations (cf. Section 2) is a system of concave equations: For every statement
$s$, the expression $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$ is a concave expression, since (1) the expression
$([\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m))\sigma$ is a basic concave expression for all $\vee$-strategies $\sigma$ (i.e., $[\![s\sigma]\!]^\sharp_{j\cdot}$ is
monotone and concave) and (2) the expression $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$ can be written as the
expression $\bigvee_{\sigma \in \Sigma} ([\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m))\sigma$. Here, $\Sigma$ denotes the set of all $\vee$-strategies. Hence,
we can generalize the concept of $\vee$-strategies as follows:



Strategies A $\vee$-strategy $\sigma$ for $\mathcal{E}$ is a function that maps every expression $\bigvee E$ occurring
in $\mathcal{E}$ to one of the $e \in E$. We denote the set of all $\vee$-strategies for $\mathcal{E}$ by $\Sigma_{\mathcal{E}}$. We
drop subscripts whenever they are clear from the context. For $\sigma \in \Sigma$, the expression
$e\sigma$ denotes the expression $\sigma(e)$. Finally, we set $\mathcal{E}(\sigma) := \{\mathbf{x} = e\sigma \mid \mathbf{x} = e \in \mathcal{E}\}$.



The Strategy Improvement Algorithm We briefly explain the strategy improvement
algorithm (cf. [21, ?]). It iterates over $\vee$-strategies. It maintains a current
$\vee$-strategy and a current approximate to the least solution. A so-called strategy improvement
operator is used for determining a next, improved $\vee$-strategy. In our application,
the strategy improvement operator is realized by a SAT modulo real linear arithmetic
solver (cf. Section 4). Whether or not a $\vee$-strategy represents an improvement may
depend on the current approximate. It can indeed be the case that a switch from one
$\vee$-strategy to another is profitable only when the least solution is known to be of a
certain size. Hence, we talk about an improvement of a $\vee$-strategy w.r.t.
an approximate:



Definition 4 (Improvements). Let $\mathcal{E}$ be a system of monotone equations over a complete
linear ordered set. Let $\sigma, \sigma' \in \Sigma$ be $\vee$-strategies for $\mathcal{E}$ and $\rho$ be a pre-solution of
$\mathcal{E}(\sigma)$. The $\vee$-strategy $\sigma'$ is called improvement of $\sigma$ w.r.t. $\rho$ iff the following conditions
are fulfilled: (1) If $\rho \notin \mathrm{Sol}(\mathcal{E})$, then $[\![\mathcal{E}(\sigma')]\!]\rho > \rho$. (2) For all $\vee$-expressions $e$ occurring
in $\mathcal{E}$ the following holds: If $\sigma'(e) \neq \sigma(e)$, then $[\![e\sigma']\!]\rho > [\![e\sigma]\!]\rho$. A function $P_\vee$ which
assigns an improvement of $\sigma$ w.r.t. $\rho$ to every pair $(\sigma, \rho)$, where $\sigma$ is a $\vee$-strategy and $\rho$
is a pre-solution of $\mathcal{E}(\sigma)$, is called $\vee$-strategy improvement operator. □

In many cases, there exist several different improvements of a $\vee$-strategy $\sigma$ w.r.t. a
pre-solution $\rho$ of $\mathcal{E}(\sigma)$. Accordingly, there exist several different strategy improvement
operators. One possibility for improving the current strategy is known as all profitable
switches [?, ?]. Carried over to the case considered here, this means: For the
improvement $\sigma'$ of $\sigma$ w.r.t. $\rho$ we have $[\![\mathcal{E}(\sigma')]\!]\rho = [\![\mathcal{E}]\!]\rho$, i.e., $\sigma'$ represents the best local
improvement of $\sigma$ at $\rho$. We denote $\sigma'$ by $P_\vee^{\mathrm{eager}}(\sigma, \rho)$ [?, ?, 23].



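Now we can formulate the strategy improvement algorithm for computing least
solutions of systems of monotone equations over complete linear ordered sets. This
algorithm is parameterized with a $\vee$-strategy improvement operator $P_\vee$. The input is
a system $\mathcal{E}$ of monotone equations over a complete linear ordered set, a $\vee$-strategy $\sigma_{\mathrm{init}}$
for $\mathcal{E}$, and a pre-solution $\rho_{\mathrm{init}}$ of $\mathcal{E}(\sigma_{\mathrm{init}})$. In order to compute the least and not some
arbitrary solution, we additionally assume that $\rho_{\mathrm{init}} \le \mu[\![\mathcal{E}]\!]$ holds:

Algorithm 1 The Strategy Improvement Algorithm

Input:
- A system $\mathcal{E}$ of monotone equations over a complete linear ordered set
- A $\vee$-strategy $\sigma_{\mathrm{init}}$ for $\mathcal{E}$
- A pre-solution $\rho_{\mathrm{init}}$ of $\mathcal{E}(\sigma_{\mathrm{init}})$ with $\rho_{\mathrm{init}} \le \mu[\![\mathcal{E}]\!]$

$\sigma \leftarrow \sigma_{\mathrm{init}}$; $\rho \leftarrow \rho_{\mathrm{init}}$; while ($\rho \notin \mathrm{Sol}(\mathcal{E})$) { $\sigma \leftarrow P_\vee(\sigma, \rho)$; $\rho \leftarrow \mu_{\ge\rho}[\![\mathcal{E}(\sigma)]\!]$; } return $\rho$;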



Lemma 13. Let $\mathcal{E}$ be a system of monotone equations over a complete linear ordered
set. For $i \in \mathbb{N}$, let $\rho_i$ be the value of the program variable $\rho$ and $\sigma_i$ be the value of the
program variable $\sigma$ in the strategy improvement algorithm after the $i$-th evaluation of
the loop-body. The following statements hold for all $i \in \mathbb{N}$:

1. $\rho_i \le \mu[\![\mathcal{E}]\!]$.
2. $\rho_i \in \mathrm{PreSol}(\mathcal{E}(\sigma_{i+1}))$.
3. If $\rho_i < \mu[\![\mathcal{E}]\!]$, then $\rho_{i+1} > \rho_i$.
4. If $\rho_i = \mu[\![\mathcal{E}]\!]$, then $\rho_{i+1} = \rho_i$. □

An immediate consequence of Lemma 13 is the following: Whenever the strategy
improvement algorithm terminates, it computes the least solution $\mu[\![\mathcal{E}]\!]$ of $\mathcal{E}$.
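The loop of Algorithm 1 can be exercised on the equations of the running example. The sketch below is a simplified illustration and not the implementation described in this paper: the names `RHS`, `improve`, and `evaluate` are hypothetical, the improvement step uses plain enumeration of the candidate right-hand sides (an "all profitable switches" choice) instead of an SMT solver, and `evaluate` approximates $\mu_{\ge\rho}[\![\mathcal{E}(\sigma)]\!]$ by Kleene iteration from $\rho$, which terminates here only because the guards of the example bound the values.

```python
import math

NEG = -math.inf

def s1(d):
    """Path s'; s1 on x1 in [-d[1], d[0]]: guard 1 <= x1 <= 1000, x1 := -2*x1."""
    lo, hi = max(1, -d[1]), min(d[0], 1000)
    return (NEG, NEG) if lo > hi else (-2 * lo, 2 * hi)

def s2(d):
    """Path s'; s2 on x1 in [-d[1], d[0]]: guard x1 <= 0, x1 := -x1 + 1."""
    lo, hi = -d[1], min(d[0], 0)
    return (NEG, NEG) if lo > hi else (-lo + 1, hi - 1)

# Candidate right-hand sides per unknown: -inf, the initial edge, path s1, path s2.
RHS = [
    [lambda d: NEG, lambda d: 0, lambda d: s1(d)[0], lambda d: s2(d)[0]],  # x_{1,1}
    [lambda d: NEG, lambda d: 0, lambda d: s1(d)[1], lambda d: s2(d)[1]],  # x_{1,2}
]

def improve(sigma, rho):
    """Switch every equation to a strictly better candidate at rho, if one exists."""
    new = list(sigma)
    for i, cands in enumerate(RHS):
        vals = [e(rho) for e in cands]
        best = max(range(len(cands)), key=vals.__getitem__)
        if vals[best] > vals[sigma[i]]:
            new[i] = best
    return tuple(new)

def evaluate(sigma, rho):
    """Stand-in for the LP-based evaluation: Kleene iteration of E(sigma) from rho."""
    while True:
        nxt = tuple(max(r, RHS[i][sigma[i]](rho)) for i, r in enumerate(rho))
        if nxt == rho:
            return rho
        rho = nxt

def strategy_iteration():
    sigma, rho = (0, 0), (NEG, NEG)        # start with x = -inf for every unknown
    trace = [sigma]
    while True:
        nxt = improve(sigma, rho)
        if nxt == sigma:                   # no profitable switch: rho solves E
            return rho, trace
        sigma = nxt
        rho = evaluate(sigma, rho)
        trace.append(sigma)

print(strategy_iteration())
```

In this run only three strategy switches are needed before no further improvement exists; the expensive part hidden in `evaluate` is exactly what the paper replaces by linear programming.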

At first we are interested in solving systems of concave equations with finitely many
strategies and finite least solutions. We show that our strategy improvement algorithm
terminates and thus returns the least solution in this case at the latest after considering
all strategies. Further, we give an important characterization for $\mu_{\ge\rho}[\![\mathcal{E}(\sigma)]\!]$.

Feasibility In order to prove termination we define the following notion of feasibility:

Definition 5 (Feasibility ([21])). Let $\mathcal{E}$ be a system of basic concave equations. A finite
solution $\rho$ of $\mathcal{E}$ is called ($\mathcal{E}$-)feasible iff there exist $\mathbf{X}_1, \mathbf{X}_2 \subseteq \mathbf{X}$ and some $k \in \mathbb{N}$ such
that the following statements hold:

1. $\mathbf{X}_1 \cup \mathbf{X}_2 = \mathbf{X}$, and $\mathbf{X}_1 \cap \mathbf{X}_2 = \emptyset$.

2. There exists some $\rho' \lhd \rho|_{\mathbf{X}_1}$ such that $\rho' \cup \rho|_{\mathbf{X}_2}$ is a pre-solution of $\mathcal{E}$, and $\rho =
[\![\mathcal{E}]\!]^k(\rho' \cup \rho|_{\mathbf{X}_2})$.

3. There exists a $\rho' \lhd \rho|_{\mathbf{X}_2}$ such that $\rho' \lhd ([\![\mathcal{E}]\!]^k(\rho|_{\mathbf{X}_1} \cup \rho'))|_{\mathbf{X}_2}$.

A finite pre-solution $\rho$ of $\mathcal{E}$ is called ($\mathcal{E}$-)feasible iff $\mu_{\ge\rho}[\![\mathcal{E}]\!]$ is a feasible finite solution
of $\mathcal{E}$. A pre-solution $\rho \lhd \infty$ is called feasible iff $e = -\infty$ for all $\mathbf{x} = e \in \mathcal{E}$ with
$[\![e]\!]\rho = -\infty$, and $\rho|_{\mathbf{X}'}$ is a feasible finite pre-solution of $\{\mathbf{x} = e \in \mathcal{E} \mid \mathbf{x} \in \mathbf{X}'\}$, where
$\mathbf{X}' := \{\mathbf{x} \mid \mathbf{x} = e \in \mathcal{E}, [\![e]\!]\rho > -\infty\}$.

A system $\mathcal{E}$ of basic concave equations is called feasible iff there exists a feasible
solution $\rho$ of $\mathcal{E}$. □

The following lemmas ensure that our strategy improvement algorithm stays in the
feasible area whenever it is started in the feasible area.

Lemma 14 ([21]). Let $\mathcal{E}$ be a system of basic concave equations and $\rho$ be a feasible
pre-solution of $\mathcal{E}$. Every pre-solution $\rho'$ of $\mathcal{E}$ with $\rho \le \rho' \le \mu_{\ge\rho}[\![\mathcal{E}]\!]$ is feasible. □

Lemma 15 ([21]). Let $\mathcal{E}$ be a system of concave equations, $\sigma$ be a $\vee$-strategy for $\mathcal{E}$, $\rho$ be
a feasible solution of $\mathcal{E}(\sigma)$, and $\sigma'$ be an improvement of $\sigma$ w.r.t. $\rho$. Then $\rho$ is a feasible
pre-solution of $\mathcal{E}(\sigma')$. □

In order to start in the feasible area, we simply start the strategy improvement algorithm
with the system $\mathcal{E} \vee -\infty := \{\mathbf{x} = e \vee -\infty \mid \mathbf{x} = e \in \mathcal{E}\}$, a $\vee$-strategy $\sigma_{\mathrm{init}}$ for $\mathcal{E} \vee -\infty$
such that $(\mathcal{E} \vee -\infty)(\sigma_{\mathrm{init}}) = \{\mathbf{x} = -\infty \mid \mathbf{x} = e \in \mathcal{E}\}$, and the feasible pre-solution $\underline{-\infty}$
of $(\mathcal{E} \vee -\infty)(\sigma_{\mathrm{init}})$.

It remains to determine $\mu_{\ge\rho}[\![\mathcal{E}]\!]$. Because of Lemma 14 and Lemma 15, we are
allowed to assume that $\rho$ is a feasible pre-solution of the system $\mathcal{E}$ of basic concave
equations. This is important in our strategy improvement algorithm. The following
lemma in particular states that we have to compute the greatest finite pre-solution.

Lemma 16 ([21]). Let $\mathcal{E}$ be a feasible system of basic concave equations with $e \neq -\infty$
for all $\mathbf{x} = e \in \mathcal{E}$. There exists a greatest finite pre-solution $\rho^*$ of $\mathcal{E}$ and $\rho^*$ is the only
feasible solution of $\mathcal{E}$. If $\rho$ is a finite pre-solution of $\mathcal{E}$, then $\rho^* = \mu_{\ge\rho}[\![\mathcal{E}]\!]$. □



Termination Lemma 16 implies that our strategy improvement algorithm has to
consider each $\vee$-strategy at most once. Thus, we have shown the following theorem:

Theorem 17. Let $\mathcal{E}$ be a system of concave equations with $\mu[\![\mathcal{E}]\!] \lhd \infty$. Assume that
we can compute the greatest finite pre-solution $\rho_\sigma$ of each $\mathcal{E}(\sigma)$, if $\mathcal{E}(\sigma)$ is feasible. Our
strategy improvement algorithm computes $\mu[\![\mathcal{E}]\!]$ and performs at most $|\Sigma| + |\mathbf{X}|$ strategy
improvement steps. The algorithm in particular terminates whenever $\Sigma$ is finite. □

6 Computing Greatest Finite Pre-Solutions



For all systems $\mathcal{E}$ of abstract semantic equations (see Section 2) and all $\vee$-strategies $\sigma$,
$\mathcal{E}(\sigma)$ is a system of abstract semantic equations, where each right-hand side is of the
form $[\![s]\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$, where $s$ is a sequential statement and $\mathbf{x}_1, \dots, \mathbf{x}_m$ are variables.
We call such a system of abstract semantic equations a system of basic abstract semantic
equations. It remains to explain how we can compute the greatest finite solution of such
a system, provided that it exists.

Let $\mathcal{E}$ be a system of basic abstract semantic equations with a greatest finite
pre-solution $\rho^*$. We can compute $\rho^*$ through linear programming as follows:

We assume w.l.o.g. that every sequential statement $s$ that occurs in the right-hand
sides of $\mathcal{E}$ is of the form $Ax \le b; x := A'x + b'$, where $A \in \mathbb{R}^{k \times n}$, $b \in \mathbb{R}^k$, $A' \in \mathbb{R}^{n \times n}$, $b' \in
\mathbb{R}^n$. This can be done w.l.o.g., since every sequential statement can be rewritten into
this form in polynomial time. We define the system $\mathcal{C}$ of linear inequalities to be the
smallest set that fulfills the following properties: For each equation

$\mathbf{x} = [\![Ax \le b; x := A'x + b']\!]^\sharp_{j\cdot}(\mathbf{x}_1, \dots, \mathbf{x}_m)$,

the system $\mathcal{C}$ contains the following constraints:

$\mathbf{x} \le T_{j\cdot} A' (y_1, \dots, y_n)^\top + T_{j\cdot} b'$
$A_{i\cdot} (y_1, \dots, y_n)^\top \le b_i$ for all $i = 1, \dots, k$
$T_{i\cdot} (y_1, \dots, y_n)^\top \le \mathbf{x}_i$ for all $i = 1, \dots, m$

Here, $y_1, \dots, y_n$ are fresh variables. Then $\rho^*(\mathbf{x}) = \sup \{\rho(\mathbf{x}) \mid \rho \in \mathrm{Sol}(\mathcal{C})\}$. Thus
$\rho^*$ can be determined by solving $|\mathbf{X}_{\mathcal{E}}|$ linear programming problems, each of which can
be constructed in linear time. We can do even better by determining an optimal
solution of the linear programming problem $\sup \{\sum_{\mathbf{x} \in \mathbf{X}_{\mathcal{E}}} \rho(\mathbf{x}) \mid \rho \in \mathrm{Sol}(\mathcal{C})\}$. Then the
optimal values for the variables $\mathbf{x} \in \mathbf{X}_{\mathcal{E}}$ determine $\rho^*$ (cf. Gawlitza and Seidl [17, 24]).
Summarizing we have:

Lemma 18. Let $\mathcal{E}$ be a system of basic abstract semantic equations with a greatest finite
pre-solution $\rho^*$. Then $\rho^*$ can be computed by solving a linear programming problem that
can be constructed in linear time. □

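Example 19. We again use the definitions of Example 7. Consider the system $\mathcal{E}$ of basic
abstract semantic equations that consists of the equations

$\mathbf{x}_{1,1} = [\![s'; s_2]\!]^\sharp_{1\cdot}(\mathbf{x}_{1,1}, \mathbf{x}_{1,2})$, $\quad \mathbf{x}_{1,2} = [\![s'; s_1]\!]^\sharp_{2\cdot}(\mathbf{x}_{1,1}, \mathbf{x}_{1,2})$,

where $s' := x_1 \le 1000; x_2 := -x_1$, $s_1 := x_2 \le -1; x_1 := -2x_1$, and $s_2 := -x_2 \le 0; x_1 :=
-x_1 + 1$. Our goal is to compute the greatest finite pre-solution $\rho^*$ of $\mathcal{E}$. Firstly, we
note that $[\![s'; s_2]\!] = [\![x_1 \le 0; (x_1, x_2) := (-x_1 + 1, -x_1)]\!]$ and $[\![s'; s_1]\!] = [\![(x_1, -x_1) \le
(1000, -1); (x_1, x_2) := (-2x_1, -x_1)]\!]$ hold. Accordingly, we have to find an optimal
solution for the following linear programming problem:

maximize $\mathbf{x}_{1,1} + \mathbf{x}_{1,2}$ subject to

$\mathbf{x}_{1,1} \le -y_1 + 1$, $\quad y_1 \le 0$, $\quad y_1 \le \mathbf{x}_{1,1}$, $\quad -y_1 \le \mathbf{x}_{1,2}$,
$\mathbf{x}_{1,2} \le 2y'_1$, $\quad y'_1 \le 1000$, $\quad -y'_1 \le -1$, $\quad y'_1 \le \mathbf{x}_{1,1}$, $\quad -y'_1 \le \mathbf{x}_{1,2}$

An optimal solution is $\mathbf{x}_{1,1} = 2001$, $\mathbf{x}_{1,2} = 2000$, $y_1 = -2000$, and $y'_1 = 1000$. Thus
$\rho^* = \{\mathbf{x}_{1,1} \mapsto 2001, \mathbf{x}_{1,2} \mapsto 2000\}$ is the greatest finite pre-solution of $\mathcal{E}$. □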
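The linear program of Example 19 is small enough to solve exactly with the standard library. The following sketch does not use a production LP solver; as an assumption made purely for illustration, it enumerates all vertices defined by four tight constraints and keeps the best feasible one, which is adequate only because this particular problem is bounded and tiny. The variable order `(x11, x12, y1, y1p)` and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Each entry (a, b) encodes a . z <= b for z = (x11, x12, y1, y1p).
CONSTRAINTS = [
    ((1, 0, 1, 0), 1),       # x11 <= -y1 + 1
    ((0, 0, 1, 0), 0),       # y1 <= 0
    ((-1, 0, 1, 0), 0),      # y1 <= x11
    ((0, -1, -1, 0), 0),     # -y1 <= x12
    ((0, 1, 0, -2), 0),      # x12 <= 2 * y1p
    ((0, 0, 0, 1), 1000),    # y1p <= 1000
    ((0, 0, 0, -1), -1),     # -y1p <= -1
    ((-1, 0, 0, 1), 0),      # y1p <= x11
    ((0, -1, 0, -1), 0),     # -y1p <= x12
]
OBJECTIVE = (1, 1, 0, 0)     # maximize x11 + x12

def dot(a, z):
    return sum(c * v for c, v in zip(a, z))

def solve_square(rows):
    """Solve a . z = b for the given rows by exact Gauss-Jordan; None if singular."""
    n = len(rows)
    m = [[Fraction(v) for v in a] + [Fraction(b)] for a, b in rows]
    for col in range(n):
        piv = next((r for r in range(col, n) if m[r][col] != 0), None)
        if piv is None:
            return None
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return tuple(row[n] for row in m)

def maximize():
    """Return (optimal value, optimal vertex) by brute-force vertex enumeration."""
    best = None
    for rows in combinations(CONSTRAINTS, 4):
        z = solve_square(rows)
        if z is None or any(dot(a, z) > b for a, b in CONSTRAINTS):
            continue
        value = dot(OBJECTIVE, z)
        if best is None or value > best[0]:
            best = (value, z)
    return best

value, vertex = maximize()
print(value, [int(v) for v in vertex])
```

The optimum found this way agrees with the solution stated in Example 19. For realistic systems one would of course pass the constraints to an LP solver instead of enumerating vertices.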

Summarizing, we have shown our main theorem:

Theorem 20. Let $\mathcal{E}$ be a system of abstract semantic equations with $\mu[\![\mathcal{E}]\!] \lhd \infty$. Our
strategy improvement algorithm computes $\mu[\![\mathcal{E}]\!]$ and performs at most $|\Sigma| + |\mathbf{X}|$ strategy
improvement steps. For each strategy improvement step, we have to do the following:

1. Find models for $|\mathbf{X}|$ SAT modulo real linear arithmetic formulas, each of which
can be constructed in linear time.

2. Solve a linear programming problem which can be constructed in linear time.

Proof. The statement follows from Lemmas [T^ [T^ [TUl [TBI and Theorem [T71 □ □ 

Our techniques can be extended straightforwardly in order to get rid of the pre-condition
$\mu[\![\mathcal{E}]\!] \lhd \infty$. However, for simplicity we eschew these technicalities in the present article.

7 An Upper Bound on the Complexity 

In Section 3, we have provided a lower bound on the complexity of computing abstract
semantics of affine programs w.r.t. the template linear domains. In this section we show
that the corresponding decision problem is not only $\Pi^p_2$-hard, but in fact $\Pi^p_2$-complete:

Theorem 21. The problem of deciding whether or not, for a given affine program $G$,
a given template constraint matrix $T$, and a given program point $v$, $V^\sharp[v] > -\infty$ holds,
is in $\Pi^p_2$.

Proof. (Sketch) We have to show that the problem of deciding whether or not, for a
given affine program $G$, a given template constraint matrix $T$, a given program point
$v$, and a given $i \in \{1, \dots, m\}$, $(V^\sharp[v])_{i\cdot} = -\infty$ holds, is in co-$\Pi^p_2 = \Sigma^p_2 = \mathrm{NP}^{\mathrm{NP}}$. In
polynomial time we can guess a $\vee$-strategy $\sigma$ for $\mathcal{E}' := \mathcal{E}(G)$ and compute the least
feasible solution $\rho$ of $\mathcal{E}'(\sigma)$ (see Gawlitza and Seidl [?]). Because of Lemma 12, we
can use an NP oracle to determine whether or not there exists an improvement of the
strategy $\sigma$ w.r.t. $\rho$. If this is not the case, we know that $\rho \ge \mu[\![\mathcal{E}']\!]$ holds. Therefore,
by Lemma 6, we have $\rho(\mathbf{x}_{v,i}) \ge (V^\sharp[v])_{i\cdot}$. Thus we can accept whenever $\rho(\mathbf{x}_{v,i}) = -\infty$
holds. □

Finally, we give an example where our strategy improvement algorithm performs
exponentially many strategy improvement steps. It is similar to the program in the proof of
Theorem 8. For all $n \in \mathbb{N}$, we consider the program $G_n = (N, E, st)$, where $N = \{st, 1\}$,
$E = \{(st, x_1 := 0; y_1 := 1; y_2 := 2y_1; \dots; y_n := 2y_{n-1}, 1), (1, s, 1)\}$, and

$s = x_2 := x_1$; $(x_2 \ge y_n; x_2 := x_2 - y_n \mid x_2 \le y_n - 1)$; $\dots$;
$(x_2 \ge y_1; x_2 := x_2 - y_1 \mid x_2 \le y_1 - 1)$; $x_1 := x_1 + 1$.

It is sufficient to use a template constraint matrix that corresponds to the interval
domain. It is remarkable that the strategy iteration does not depend on the strategy
improvement operator in use. At any time there is exactly one possible improvement
until the least solution is reached. All strategies for the statement $s$ will be encountered.
Thus, the strategy improvement algorithm performs $2^n$ strategy improvement
steps. Since the size of $G_n$ is $\Theta(n)$, exponentially many strategy improvement steps are
performed.
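Why all $2^n$ strategies for $s$ matter can be seen already on concrete runs of $G_n$. The sketch below is an illustration on the concrete program only and does not execute the strategy improvement algorithm; the function names are hypothetical. It records, for each value of $x_1$ at loop entry, which branch of every choice in $s$ is taken.

```python
def branch_pattern(n, x1):
    """Branches taken by one pass through s in G_n when the loop is entered with x1.
    Entry i of the result is 1 if the subtracting branch for y_i = 2^(i-1) is taken,
    listed from y_n down to y_1."""
    x2, pattern = x1, []
    for i in range(n, 0, -1):
        y_i = 2 ** (i - 1)
        if x2 >= y_i:          # branch: x2 >= y_i; x2 := x2 - y_i
            x2 -= y_i
            pattern.append(1)
        else:                  # branch: x2 <= y_i - 1
            pattern.append(0)
    return tuple(pattern)

def distinct_patterns(n, iterations):
    """Set of branch patterns seen during the first `iterations` loop iterations."""
    return {branch_pattern(n, x1) for x1 in range(iterations)}

print(len(distinct_patterns(3, 8)))
```

Each pattern corresponds to one $\vee$-strategy for $s$, and the first $2^n$ iterations already realize every one of them, since the pattern is the binary representation of $x_1$.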



8 Conclusion 



We presented an extension of the strategy improvement algorithm of Gawlitza and
Seidl [17, ?, 21] which enables us to use a SAT modulo real linear arithmetic solver
for determining improvements of strategies w.r.t. current approximates. Due to this
extension, we are able to compute abstract semantics of affine programs w.r.t. the
template linear constraint domains of Sankaranarayanan et al. [?], where we abstract
sequences of if-then-else statements without loops en bloc. This gives us additional
precision. Additionally, we provided one of the few "hard" complexity results regarding
precise abstract interpretation.

It remains to practically evaluate the presented approach and to compare it
systematically with other approaches. Besides this, starting from the present work, there
are several directions to explore. One can for instance try to apply the same ideas for
non-linear templates [21], or to use linearization techniques [35].